AUT University Certificate in Foundation Studies

Delivered by ACG Norton College
FOUNDATION STATISTICS ASSIGNMENT 2
NAME:…………………………………. ID:………………………..
Due: Thursday 26th November, 2015
Mark Scheme
1. Assignment questions   75
R Test                    15
TOTAL                    /90
………………………%
1. All parts of your assignment MUST be word processed. Any part
written in ink or pencil will be ignored!
2. Label all graphs appropriately, and give each graph a suitable main
title.
3. Show ALL workings and R output.
4. Round all calculations sensibly.
5. Assignments handed in late will receive 0%.
Section A. This section is to be completed using ONLY your calculator for the required
workings.
Question 1 [10]
(a) Ages from a group of athletes are approximately normal, X ~ N(33.1 yrs, 2.3 yrs). Apply the
68-95-99.7 Rule to determine the interval in which the middle 99.7% of all ages will fall.
[2]
Using Z tables
(b) Determine what percentage of athletes’ ages will be below 31 years. [2]

(c) If a sample of 1700 ages is taken from the athletes, determine how many athletes will have
an age above 34 years. [3]
(d) Find the upper quartile for the athletes’ ages. [3]
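
The four parts of Question 1 can be cross-checked numerically. The sketch below is only an illustration: it assumes the second parameter in N(33.1, 2.3) is the standard deviation rather than the variance, and it uses Python's standard-library NormalDist in place of Z tables, so its results may differ slightly from table-based answers, which round z to two decimal places.

```python
from statistics import NormalDist

# Assumption: N(33.1, 2.3) gives the mean and the standard deviation in years.
ages = NormalDist(mu=33.1, sigma=2.3)

# (a) 68-95-99.7 rule: the middle 99.7% lies within 3 standard deviations.
low = ages.mean - 3 * ages.stdev
high = ages.mean + 3 * ages.stdev

# (b) Proportion of ages below 31 years.
p_below_31 = ages.cdf(31)

# (c) Expected number of ages above 34 years in a sample of 1700.
expected_above_34 = 1700 * (1 - ages.cdf(34))

# (d) Upper quartile: the age with 75% of the distribution below it.
upper_quartile = ages.inv_cdf(0.75)

print(f"(a) {low:.1f} to {high:.1f} years")
print(f"(b) {100 * p_below_31:.1f}% below 31 years")
print(f"(c) about {expected_above_34:.0f} athletes above 34 years")
print(f"(d) upper quartile {upper_quartile:.2f} years")
```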
Question 2 [25]
(a) Explain the terms non-response bias and response bias in sampling. Give an example of each, not
the same as those in your notes. [4]

(b) Name some factors that make a successful questionnaire. [2]
(c) Define the term “sampling frame”. [1]
(d) What is an undercount in a census? [2]
(e) 1. Describe how to use a calculator to randomly select numbers in a range from 01 to 60. [2]
2. The heights, in cm, of a community of 60 people are collected in the table below.

i) Use Table B of random digits to undertake a simple random sample to select 8
people's heights from the table above. Start at the beginning of row 105, reading the digits
continuously from left to right across the row. Place your results below. [2]
ii) Evaluate your sample mean height. [2]
3. In the population of 60 heights, 25 are from females. Fully describe how you would
complete a stratified random sample of size 20 with respect to gender. You do not
need to do the sample. [5]
[Table: Person # | Height, cm]
4. A systematic sample of size 6 is to be undertaken from the population of heights. Explain
fully how a systematic sampling procedure is conducted if a random starting position at the
height numbered 16 is chosen. List the six heights selected. [3]
5. State one advantage and one disadvantage of a census of all 60 people in the community.
[2]
Advantage:
Disadvantage:
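
The bookkeeping behind the stratified and systematic samples in part (e) can be sketched in a few lines. This is an illustration under stated assumptions, not a model answer: it assumes proportional allocation rounded to whole people for the stratified sample, and it assumes the systematic sample wraps around to the start of the list once it passes person 60 (one common convention). It only produces position numbers, since the height table itself is not available here.

```python
population_size = 60

# Stratified sample (part 3): allocate the 20 places in proportion
# to the size of each gender stratum. Rounding is an assumption.
sample_size = 20
females = 25
males = population_size - females
female_quota = round(sample_size * females / population_size)
male_quota = sample_size - female_quota

# Systematic sample (part 4): interval k = N / n, then every k-th
# person from the random start, wrapping past 60 (assumed convention).
systematic_n = 6
interval = population_size // systematic_n
start = 16
positions = [
    (start - 1 + i * interval) % population_size + 1
    for i in range(systematic_n)
]

print(f"Stratified: {female_quota} females and {male_quota} males")
print(f"Systematic: interval {interval}, positions {positions}")
```

Within each stratum the quota would still be filled by a simple random sample, for instance with Table B or a calculator's random number function.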
Question 3 [10]
The variable self.concept can be found in the data set EduData
(Blackboard > RData > EduData.txt).
(a) Explain fully in what way the distribution for self.concept is non-Normal. You must use and
include a Normal Quantile plot. [4]
(b) What is granularity? Explain whether granularity is present in this data set. [2]

(c) Use R Commander to provide proof that the observations of self.concept were taken from a
non-Normal distribution.
Produce another graph as well as summary statistics to form part of the proof. The graph
must have suitable labels and titles. Write a small paragraph on your findings. [4]
Section B. This section is to be completed using ONLY R Commander for the required
workings.
Question 4 [30]
a) A random sample surveyed 78 Year Five students at a large school and the researcher recorded
several variable values for each student. A linear relationship between two of the variables shown
below was investigated:
Variable   Description
NSL        National Standard Literacy - a numeric academic measure.
IQ         Intelligence Quotient - a numeric intelligence measure.
Linear Regression output from R

Call:
lm(formula = NSL ~ IQ, data = NSL)

Residuals:
    Min      1Q  Median      3Q     Max
-6.3182 -0.5377  0.2178  1.0268  3.5785

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.55706    1.55176  -2.292   0.0247 *
IQ           0.10102    0.01414   7.142 4.74e-10 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.635 on 76 degrees of freedom
Multiple R-squared:  0.4016,  Adjusted R-squared:  0.3937
F-statistic: 51.01 on 1 and 76 DF,  p-value: 4.737e-10
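
A few quantities follow directly from the output above and can be checked by hand or with a short script. The sketch below is illustrative only. It relies on two standard facts about simple linear regression: the correlation coefficient is the square root of R-squared with the sign of the slope, and the least-squares line passes through the point (mean IQ, mean NSL). The function name predict_nsl is made up for this example, and the inputs 115 and 108.9 are the IQ values quoted in the parts that follow.

```python
import math

# Values copied from the R output above.
slope = 0.10102
r_squared = 0.4016

# Correlation coefficient: square root of R-squared, with the slope's sign.
r = math.copysign(math.sqrt(r_squared), slope)


def predict_nsl(iq, b0=-3.557, b1=0.101):
    """Fitted NSL from the rounded model equation NSL = 0.101 IQ - 3.557."""
    return b1 * iq + b0


# Fitted NSL for a student with an IQ of 115.
nsl_at_115 = predict_nsl(115)

# The least-squares line passes through (mean IQ, mean NSL),
# so the fitted value at the mean IQ is the mean NSL.
mean_nsl = predict_nsl(108.9)

print(f"r = {r:.4f}")
print(f"Fitted NSL at IQ 115: {nsl_at_115:.3f}")
print(f"Mean NSL: {mean_nsl:.2f}")
```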

i) Identify the response variable from the regression output. [1]
ii) The minimum residual value from the output is -6.3182. Indicate on the scatterplot which
data point this is, by circling the point. [1]
iii) Calculate the correlation coefficient for the relationship. [2]
iv) Describe the relationship between NSL level and IQ. Include any unusual features. [4]
v) The linear regression equation for estimating the NSL level from IQ of a student is
NSL = 0.101IQ – 3.557 (coefficients are rounded to 3 decimal places)
State a limitation of this model equation in predicting the NSL levels of students. [2]

vi) Interpret the gradient of the regression line equation. Include units. [3]
vii) Use the model equation, NSL = 0.101IQ – 3.557 to estimate the NSL level of a student with
an IQ of 115. [3]
viii) The mean of the variable IQ is 108.9. Use this result to find the mean of the variable NSL.
Explain your method. [3]

ix) If the two variables, NSL and IQ, were interchanged (swapped), explain in general the effect
on the equation of the regression line and the R-squared value for the new relationship. [3]
Regression Equation:
R²:
x) 1) State the value of the coefficient of determination by referring to the R output. [1]
2) Explain the meaning of this value in the context of these variables. [2]
xi) A pilot survey of the Year Five students was undertaken before the main sampling exercise. The
results are in the table for six individuals:

1) Use your calculator to find the correlation coefficient for the relationship. [1]

2) Assuming a linear model, with IQ as the explanatory variable, the equation for the least
squares regression line for the relationship is:
y = 0.1082x – 4.203 (coefficients rounded to 4SF)

Find the residual (prediction error) for the observed value (105, 8.4) [2]
3) Produce a scatterplot for the relationship. [2]
IQ    90    100   105   107   112   126
NSL   5.3   6     8.4   7.2   8     9.1
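
The pilot-survey calculations can be reproduced from the six data pairs in the table above. The sketch below is a hedged illustration using the textbook least-squares formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x); the variable names are chosen for this example. It uses unrounded coefficients, so the residual may differ in the third decimal place from one computed with coefficients rounded to 4SF.

```python
from math import sqrt

# Pilot data copied from the table above.
iq = [90, 100, 105, 107, 112, 126]
nsl = [5.3, 6, 8.4, 7.2, 8, 9.1]

n = len(iq)
mean_x = sum(iq) / n
mean_y = sum(nsl) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in iq)
syy = sum((y - mean_y) ** 2 for y in nsl)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(iq, nsl))

# Correlation coefficient and least-squares line (IQ explanatory).
r = sxy / sqrt(sxx * syy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual = observed - predicted, for the observation (105, 8.4).
predicted = intercept + slope * 105
residual = 8.4 - predicted

print(f"r = {r:.4f}")
print(f"NSL = {slope:.4f} * IQ + ({intercept:.3f})")
print(f"Residual at (105, 8.4): {residual:.3f}")
```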
